AITopics | joint strategy

525b8410cc8612283c9ecaf9a319f8ed-AuthorFeedback.pdf

Neural Information Processing Systems, Feb-12-2026, 04:35:57 GMT

cfr-jr, cfr-s, rev, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.31)

Rev.1 on Th. 4 and

Neural Information Processing Systems, Oct-2-2025, 17:48:04 GMT

Th. 4 and 5 should be evaluated as the key building blocks of CFR-Jr, which is shown to [...] Th. 4 is necessary to show the soundness of the reconstruction algorithm; Th. 5 shows that CFR-Jr approaches the set of CCEs, [...] CFR [43], because our reconstruction procedure does not alter the way in which regret is minimized. We will clarify this in the paper. In general, CFR-S performs worse than CFR-Jr, as it needs many more iterations to converge. In practice, CFR-Jr allows building dramatically smaller solutions, e.g., the figure displays the percentage difference [...] The figure considers G2-4 with different tie-breaking rules. Rev.1 [...] for more details on how we compute the social welfare ratio.

artificial intelligence, cfr-jr, rev, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.31)

Grouped Satisficing Paths in Pure Strategy Games: a Topological Perspective

Fu, Yanqing, Huang, Chao, Wang, Chenrun, Wang, Zhuping

arXiv.org Artificial Intelligence, Sep-30-2025

In game theory and multi-agent reinforcement learning (MARL), each agent selects a strategy, interacts with the environment and other agents, and subsequently updates its strategy based on the received payoff. This process generates a sequence of joint strategies $(s^t)_{t \geq 0}$, where $s^t$ represents the strategy profile of all agents at time step $t$. A widely adopted principle in MARL algorithms is "win-stay, lose-shift", which dictates that an agent retains its current strategy if it achieves the best response. This principle exhibits a fixed-point property when the joint strategy has become an equilibrium. The sequence of joint strategies under this principle is referred to as a satisficing path, a concept first introduced in [40] and explored in the context of $N$-player games in [39]. A fundamental question arises regarding this principle: Under what conditions does every initial joint strategy $s$ admit a finite-length satisficing path $(s^t)_{0 \leq t \leq T}$ where $s^0=s$ and $s^T$ is an equilibrium? This paper establishes a sufficient condition for such a property, and demonstrates that any finite-state Markov game, as well as any $N$-player game, guarantees the existence of a finite-length satisficing path from an arbitrary initial strategy to some equilibrium. These results provide a stronger theoretical foundation for the design of MARL algorithms.
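The satisficing-path idea in the abstract above can be illustrated with a minimal sketch. It assumes a finite normal-form game restricted to pure strategies, which is a simplification and not the paper's setting. The payoff table, the function names, and the breadth-first search are all hypothetical choices made for illustration: a player who is already best responding must keep its strategy, while every other player may switch freely.

```python
from collections import deque
from itertools import product


def is_best_response(payoff, n_actions, s, i):
    """True if player i cannot strictly gain by deviating unilaterally from s."""
    current = payoff[s][i]
    return all(
        payoff[s[:i] + (a,) + s[i + 1:]][i] <= current
        for a in range(n_actions[i])
    )


def satisficing_successors(payoff, n_actions, s):
    """Joint strategies reachable from s in one satisficing step."""
    options = [
        # Satisfied players stay; unsatisfied players may pick any action.
        (s[i],) if is_best_response(payoff, n_actions, s, i)
        else tuple(range(n_actions[i]))
        for i in range(len(s))
    ]
    return [t for t in product(*options) if t != s]


def find_satisficing_path(payoff, n_actions, start):
    """Breadth-first search for a shortest satisficing path to a pure equilibrium.

    Returns the path as a list of joint strategies, or None if no pure
    equilibrium is reachable from `start`.
    """
    def is_equilibrium(s):
        return all(is_best_response(payoff, n_actions, s, i) for i in range(len(s)))

    parent = {start: None}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        if is_equilibrium(s):
            path = []
            while s is not None:
                path.append(s)
                s = parent[s]
            return path[::-1]
        for t in satisficing_successors(payoff, n_actions, s):
            if t not in parent:
                parent[t] = s
                queue.append(t)
    return None


# Hypothetical example: prisoner's dilemma, action 0 = cooperate, 1 = defect.
PD = {(0, 0): (3, 3), (0, 1): (0, 5), (1, 0): (5, 0), (1, 1): (1, 1)}
pd_path = find_satisficing_path(PD, [2, 2], (0, 0))
```

In this toy game both players are unsatisfied at mutual cooperation, so the search may move both at once and reaches mutual defection in a single step. In a game with no pure equilibrium the function returns `None`; that outcome reflects the pure-strategy restriction of this sketch only.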

equilibrium, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2509.23157

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Ranking Joint Policies in Dynamic Games using Evolutionary Dynamics

Koliou, Natalia, Vouros, George

arXiv.org Artificial Intelligence, Feb-20-2025

Game-theoretic solution concepts, such as the Nash equilibrium, have been key to finding stable joint actions in multi-player games. However, it has been shown that the dynamics of agents' interactions, even in simple two-player games with few strategies, are incapable of reaching Nash equilibria, exhibiting complex and unpredictable behavior. Instead, evolutionary approaches can describe the long-term persistence of strategies and filter out transient ones, accounting for the long-term dynamics of agents' interactions. Our goal is to identify agents' joint strategies that result in stable behavior, being resistant to changes, while also accounting for agents' payoffs, in dynamic games. Towards this goal, and building on previous results, this paper proposes transforming dynamic games into their empirical forms by considering agents' strategies instead of agents' actions, and applying the evolutionary methodology $\alpha$-Rank to evaluate and rank strategy profiles according to their long-term dynamics. This methodology not only allows us to identify joint strategies that are strong through agents' long-term interactions, but also provides a descriptive, transparent framework regarding the high ranking of these strategies. Experiments report on agents that aim to collaboratively solve a stochastic version of the graph coloring problem. We consider different styles of play as strategies to define the empirical game, and train policies realizing these strategies, using the DQN algorithm. Then we run simulations to generate the payoff matrix required by $\alpha$-Rank to rank joint strategies.

agent, matrix, strategy profile, (14 more...)

arXiv.org Artificial Intelligence

2502.14724

Country:

North America > United States > Michigan > Wayne County > Detroit (0.06)
North America > United States > New York > New York County > New York City (0.04)
Europe > Greece (0.04)
(6 more...)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games > Chess (0.46)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Learning Meta Representations for Agents in Multi-Agent Reinforcement Learning

Zhang, Shenao, Shen, Li, Han, Lei, Shen, Li

arXiv.org Artificial Intelligence, Jun-5-2023

In multi-agent reinforcement learning, the behaviors that agents learn in a single Markov Game (MG) are typically confined to the given agent number. Every single MG induced by varying the population may possess distinct optimal joint strategies and game-specific knowledge, which are modeled independently in modern multi-agent reinforcement learning algorithms. In this work, our focus is on creating agents that can generalize across population-varying MGs. Instead of learning a unimodal policy, each agent learns a policy set comprising effective strategies across a variety of games. To achieve this, we propose Meta Representations for Agents (MRA) that explicitly models the game-common and game-specific strategic knowledge. By representing the policy sets with multi-modal latent policies, the game-common strategic knowledge and diverse strategic modes are discovered through an iterative optimization procedure. We prove that by approximately maximizing the resulting constrained mutual information objective, the policies can reach Nash Equilibrium in every evaluation MG when the latent space is sufficiently large. When deploying MRA in practical settings with limited latent space sizes, fast adaptation can be achieved by leveraging the first-order gradient information. Extensive experiments demonstrate the effectiveness of MRA in improving training performance and generalization ability in challenging evaluation games.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

2108.12988

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.47)

Cooperative Concurrent Games

Gutierrez, Julian, Kowara, Szymon, Kraus, Sarit, Steeples, Thomas, Wooldridge, Michael

arXiv.org Artificial Intelligence, Jan-15-2023

In rational verification, the aim is to verify which temporal logic properties will obtain in a multi-agent system, under the assumption that agents ("players") in the system choose strategies for acting that form a game theoretic equilibrium. Preferences are typically defined by assuming that agents act in pursuit of individual goals, specified as temporal logic formulae. To date, rational verification has been studied using non-cooperative solution concepts - Nash equilibrium and refinements thereof. Such non-cooperative solution concepts assume that there is no possibility of agents forming binding agreements to cooperate, and as such they are restricted in their applicability. In this article, we extend rational verification to cooperative solution concepts, as studied in the field of cooperative game theory. We focus on the core, as this is the most fundamental (and most widely studied) cooperative solution concept. We begin by presenting a variant of the core that seems well-suited to the concurrent game setting, and we show that this version of the core can be characterised using ATL*. We then study the computational complexity of key decision problems associated with the core, which range from problems in PSPACE to problems in 3EXPTIME. We also investigate conditions that are sufficient to ensure that the core is non-empty, and explore when it is invariant under bisimilarity. We then introduce and study a number of variants of the main definition of the core, leading to the issue of credible deviations, and to stronger notions of collective stable behaviour. Finally, we study cooperative rational verification using an alternative model of preferences, in which players seek to maximise the mean-payoff they obtain over an infinite play in games where quantitative information is allowed.

artificial intelligence, deviation, game theory, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.artint.2022.103806

2301.06157

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
(11 more...)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Game Theoretic Rating in N-player general-sum games with Equilibria

Marris, Luke, Lanctot, Marc, Gemp, Ian, Omidshafiei, Shayegan, McAleer, Stephen, Connor, Jerome, Tuyls, Karl, Graepel, Thore

arXiv.org Artificial Intelligence, Oct-5-2022

Rating strategies in a game is an important area of research in game theory and artificial intelligence, and can be applied to any real-world competitive or cooperative setting. Traditionally, only transitive dependencies between strategies have been used to rate strategies (e.g. Elo); however, recent work has expanded ratings to utilize game theoretic solutions to better rate strategies in non-transitive games. This work generalizes these ideas and proposes novel algorithms suitable for N-player, general-sum rating of strategies in normal-form games according to the payoff rating system. This enables well-established solution concepts, such as equilibria, to be leveraged to efficiently rate strategies in games with complex strategic interactions, which arise in multiagent training and real-world interactions between many agents. We empirically validate our methods on real-world normal-form data (Premier League) and multiagent reinforcement learning agent evaluation.
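The abstract does not define the payoff rating, so the sketch below rests on an assumption: that a strategy's rating is the player's expected payoff under a given joint distribution, conditioned on that player using the strategy. The function name, the dictionary-based game representation, and the example distribution are hypothetical, and computing the equilibrium distribution itself is out of scope here.

```python
def payoff_ratings(payoffs, joint, player):
    """Conditional expected payoff of each of `player`'s strategies.

    payoffs: dict mapping a joint action (tuple) to a tuple of per-player payoffs.
    joint:   dict mapping a joint action to its probability (assumed to be given,
             for example by some equilibrium solver).
    Strategies with zero probability mass get no rating in this sketch.
    """
    weighted_sum = {}
    mass = {}
    for action, prob in joint.items():
        s = action[player]
        weighted_sum[s] = weighted_sum.get(s, 0.0) + prob * payoffs[action][player]
        mass[s] = mass.get(s, 0.0) + prob
    return {s: weighted_sum[s] / mass[s] for s in weighted_sum if mass[s] > 0}


# Hypothetical coordination game with a joint distribution that mixes
# evenly between the two coordinated outcomes.
COORD = {(0, 0): (2, 2), (0, 1): (0, 0), (1, 0): (0, 0), (1, 1): (1, 1)}
SIGMA = {(0, 0): 0.5, (1, 1): 0.5}
coord_ratings = payoff_ratings(COORD, SIGMA, player=0)
```

Under this reading, a rating depends on how the other players behave when the strategy is recommended, which is what lets it handle non-transitive games where a single scalar skill score such as Elo is not enough.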

artificial intelligence, game theory, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2210.02205

Country:

Europe > United Kingdom > England > Leicestershire > Leicester (0.07)
Europe > United Kingdom > England > West Yorkshire > Huddersfield (0.07)
Europe > United Kingdom > England > Dorset > Bournemouth (0.07)
(6 more...)

Genre:

Overview (0.46)
Research Report (0.40)

Industry:

Leisure & Entertainment > Games (1.00)
Leisure & Entertainment > Sports > Soccer (0.72)
Leisure & Entertainment > Sports > Tennis (0.47)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Policy Evaluation and Seeking for Multi-Agent Reinforcement Learning via Best Response

Yan, Rui, Duan, Xiaoming, Shi, Zongying, Zhong, Yisheng, Marden, Jason R., Bullo, Francesco

arXiv.org Machine Learning, Jun-20-2020

This paper introduces two metrics (cycle-based and memory-based metrics), grounded on a dynamical game-theoretic solution concept called sink equilibrium, for the evaluation, ranking, and computation of policies in multi-agent learning. We adopt strict best response dynamics (SBRD) to model selfish behaviors at a meta-level for multi-agent reinforcement learning. Our approach can deal with dynamical cyclical behaviors (unlike approaches based on Nash equilibria and Elo ratings), and is more compatible with single-agent reinforcement learning than alpha-rank which relies on weakly better responses. We first consider settings where the difference between largest and second largest underlying metric has a known lower bound. With this knowledge we propose a class of perturbed SBRD with the following property: only policies with maximum metric are observed with nonzero probability for a broad class of stochastic games with finite memory. We then consider settings where the lower bound for the difference is unknown. For this setting, we propose a class of perturbed SBRD such that the metrics of the policies observed with nonzero probability differ from the optimal by any given tolerance. The proposed perturbed SBRD addresses the opponent-induced non-stationarity by fixing the strategies of others for the learning agent, and uses empirical game-theoretic analysis to estimate payoffs for each strategy profile obtained due to the perturbation.
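To make the notions of strict best response dynamics and sink equilibrium more concrete, here is a minimal sketch on a finite normal-form game. This is far simpler than the stochastic-game, policy-level setting of the paper, and it does not implement the paper's perturbed SBRD or its metrics. It uses the standard definition of a sink equilibrium as a strongly connected component of the response graph with no outgoing edges. All names and the example game are hypothetical.

```python
def strict_best_response_edges(payoff, n_actions):
    """Response graph: s -> s' when one player moves to a best response
    that strictly improves that player's payoff."""
    edges = {s: set() for s in payoff}
    for s in payoff:
        for i in range(len(s)):
            values = {
                a: payoff[s[:i] + (a,) + s[i + 1:]][i]
                for a in range(n_actions[i])
            }
            best = max(values.values())
            if best > values[s[i]]:  # a strict improvement exists
                for a, v in values.items():
                    if v == best:
                        edges[s].add(s[:i] + (a,) + s[i + 1:])
    return edges


def reachable(edges, start):
    """All nodes reachable from `start`, including `start` itself."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in edges[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def sink_equilibria(edges):
    """Sink strongly connected components of the response graph."""
    reach = {s: reachable(edges, s) for s in edges}
    sinks = set()
    for s, r in reach.items():
        # s lies in a sink component iff every node reachable from s can
        # reach s back; the component is then exactly reach[s].
        if all(s in reach[t] for t in r):
            sinks.add(frozenset(r))
    return sinks


# Hypothetical example: matching pennies, which has no pure Nash equilibrium.
MP = {(0, 0): (1, -1), (0, 1): (-1, 1), (1, 0): (-1, 1), (1, 1): (1, -1)}
mp_sinks = sink_equilibria(strict_best_response_edges(MP, [2, 2]))
```

In matching pennies the dynamics cycle through all four joint strategies, so the only sink equilibrium is the whole cycle. This is the kind of cyclical behavior that the abstract contrasts with Nash-based and Elo-based evaluation.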

agent, joint strategy, sink equilibrium, (14 more...)

arXiv.org Machine Learning

2006.09585

Country:

South America > Brazil > São Paulo (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
(8 more...)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Resolving Congestions in the Air Traffic Management Domain via Multiagent Reinforcement Learning Methods

Kravaris, Theocharis, Spatharis, Christos, Bastas, Alevizos, Vouros, George A., Blekas, Konstantinos, Andrienko, Gennady, Andrienko, Natalia, Garcia, Jose Manuel Cordero

arXiv.org Artificial Intelligence, Dec-14-2019

In this article, we report on the efficiency and effectiveness of multiagent reinforcement learning (MARL) methods for the computation of flight delays to resolve congestion problems in the Air Traffic Management (ATM) domain. Specifically, we aim to resolve cases where demand for airspace use exceeds capacity (demand-capacity problems) by imposing ground delays on flights at the pre-tactical stage of operations (i.e. a few days to a few hours before operation). Casting this into the multiagent domain, agents, representing flights, need to decide on their own delays w.r.t. their own preferences, having no information about others' payoffs, preferences and constraints, while they plan to execute their trajectories jointly with others, adhering to operational constraints. Specifically, we formalize the problem as a multiagent Markov Decision Process (MA-MDP) and we show that it can be considered as a Markov game in which interacting agents need to reach an equilibrium. What makes the problem more interesting is the dynamic setting in which agents operate, which is also due to the unforeseen, emergent effects of their decisions in the whole system. We propose collaborative multiagent reinforcement learning methods to resolve demand-capacity imbalances. An extensive experimental study on real-world cases shows the potential of the proposed approaches in resolving problems, while advanced visualizations provide detailed views towards understanding the quality of solutions provided.

agent, flight, hotspot, (17 more...)

arXiv.org Artificial Intelligence

1912.0686

Country:

North America > United States (0.04)
Europe > Spain > Canary Islands (0.04)
Europe > Greece > Epirus > Ioannina (0.04)
Europe > Germany (0.04)

Genre:

Research Report > New Finding (0.66)
Research Report > Experimental Study (0.48)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Air (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Data-Driven Vehicle Trajectory Forecasting

Jawed, Shayan, Boumaiza, Eya, Grabocka, Josif, Schmidt-Thieme, Lars

arXiv.org Machine Learning, Feb-9-2019

An active area of research is to increase the safety of self-driving vehicles. Although safety cannot be guaranteed completely, the capability of a vehicle to predict the future trajectories of its surrounding vehicles could help ensure this notion of safety to a greater degree. We cast the trajectory forecast problem as a multi-time step forecasting problem and develop a Convolutional Neural Network based approach to learn from trajectory sequences generated from a completely raw dataset in real-time. Results show improvement over baselines.

sequence, trajectory, vehicle, (13 more...)

arXiv.org Machine Learning

1902.054

Country:

Europe > Germany (0.04)
North America > United States > California > Santa Clara County > Mountain View (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
